370 research outputs found

    Reactive Reinforcement Learning in Asynchronous Environments

    The relationship between a reinforcement learning (RL) agent and an asynchronous environment is often ignored. Frequently used models of the interaction between an agent and its environment, such as Markov Decision Processes (MDP) or Semi-Markov Decision Processes (SMDP), do not capture the fact that, in an asynchronous environment, the state of the environment may change during computation performed by the agent. In an asynchronous environment, minimizing reaction time (the time it takes for an agent to react to an observation) also minimizes the time in which the state of the environment may change following observation. In many environments, the reaction time of an agent directly impacts task performance by permitting the environment to transition into either an undesirable terminal state or a state where performing the chosen action is inappropriate. We propose a class of reactive reinforcement learning algorithms that address this problem of asynchronous environments by acting immediately after observing new state information. We compare a reactive SARSA learning algorithm with the conventional SARSA learning algorithm on two asynchronous robotic tasks (emergency stopping and impact prevention), and show that the reactive RL algorithm reduces the reaction time of the agent by approximately the duration of the algorithm's learning update. This new class of reactive algorithms may facilitate safer control and faster decision making without any change to standard learning guarantees. Comment: 11 pages, 7 figures, currently under journal peer review.
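    The reordering described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy `ChainEnv` task, the helper names, and all parameter values are hypothetical and chosen only for illustration. The sketch assumes that conventional SARSA runs its learning update between observing and acting, while the reactive variant acts first and learns afterwards.

```python
import random
from collections import defaultdict


def epsilon_greedy(q, state, actions, epsilon, rng):
    """Pick a random action with probability epsilon, else the greedy one."""
    if rng.random() < epsilon:
        return rng.choice(actions)
    return max(actions, key=lambda a: q[(state, a)])


def sarsa_update(q, s, a, r, s2, a2, alpha, gamma, done):
    """Standard on-policy SARSA temporal-difference update."""
    target = r if done else r + gamma * q[(s2, a2)]
    q[(s, a)] += alpha * (target - q[(s, a)])


class ChainEnv:
    """Hypothetical toy task: walk from position 0 to position n - 1."""

    def __init__(self, n=5):
        self.n = n
        self.pos = 0

    def reset(self):
        self.pos = 0
        return self.pos

    def act(self, action):
        # action is -1 (left) or +1 (right); positions are clamped to the chain
        self.pos = min(max(self.pos + action, 0), self.n - 1)

    def observe(self):
        done = self.pos == self.n - 1
        reward = 1.0 if done else -0.01
        return reward, self.pos, done


def run_episode(env, q, reactive, alpha=0.5, gamma=0.9, epsilon=0.1,
                rng=None, max_steps=200):
    """Run one episode and return the order of act/observe/learn events."""
    if rng is None:
        rng = random.Random(0)
    actions = [-1, 1]
    order = []
    s = env.reset()
    a = epsilon_greedy(q, s, actions, epsilon, rng)
    env.act(a)
    order.append("act")
    for _ in range(max_steps):
        r, s2, done = env.observe()
        order.append("observe")
        a2 = epsilon_greedy(q, s2, actions, epsilon, rng)
        if reactive and not done:
            # Reactive ordering: act immediately, then do the learning update.
            env.act(a2)
            order.append("act")
            sarsa_update(q, s, a, r, s2, a2, alpha, gamma, done)
            order.append("learn")
        else:
            # Conventional ordering: the update delays the next action.
            sarsa_update(q, s, a, r, s2, a2, alpha, gamma, done)
            order.append("learn")
            if not done:
                env.act(a2)
                order.append("act")
        if done:
            break
        s, a = s2, a2
    return order


q_conv, q_reac = defaultdict(float), defaultdict(float)
order_conv = run_episode(ChainEnv(), q_conv, reactive=False, rng=random.Random(0))
order_reac = run_episode(ChainEnv(), q_reac, reactive=True, rng=random.Random(0))
print(order_conv[:4])
print(order_reac[:4])
```

    In this synchronous toy task both orderings apply the same updates, so the two Q-tables end up identical; only the position of the learning step relative to the action differs. In a genuinely asynchronous environment that position determines how long the environment can change before the agent reacts.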

    Columnaris disease: etiology, clinical signs, and submission of samples for laboratory analysis.

    bitstream/item/42313/1/DOC109-columnariose.pd

    Hematological variables in tambaqui anesthetized with clove oil and benzocaine.

    The objective of this work was to evaluate the anesthetic effect of clove oil and benzocaine on hematological parameters and erythrocyte osmotic fragility in tambaqui (Colossoma macropomum).

    A Comparison of Different Methods for the Detection of a Weak Adhesive/Adherend Interface in Bonded Joints

    There are three main classes of defect which occur in adhesive joints: complete disbonds, voids or porosity in the adhesive layer; poor cohesion (i.e. a weak adhesive layer); and poor adhesion (i.e. a weak interface between the adhesive layer and one or both adherends). The detection of disbonds, voids and porosity generally presents few problems, and significant progress has been made towards the development of techniques for monitoring the cohesive properties of the adhesive layer [1]. However, there is no satisfactory method for the detection of a weak interface between the adhesive and the adherend(s), and this remains one of the major challenges in NDE. It is the interlayer which is affected by the common problem of slight contamination due to, for example, grease on the adherend surfaces prior to bonding. The adhesive/adherend interface is particularly important in aluminium-aluminium joints, in which an inappropriate interface structure can cause greatly enhanced susceptibility to environmental attack [2]. Inspection of the interlayer is difficult because it is frequently only of the order of 1 µm thick, compared with an adhesive layer thickness of the order of 100 µm.

    Can energy self‑sufficiency be achieved? Case study of Warmińsko‑Mazurskie Voivodeship (Poland)

    An analysis was carried out to determine whether the Warmińsko-Mazurskie Voivodeship (Poland) could become energy self-sufficient. The technical potential of electricity and heat from renewable sources was calculated. The calculated values are 6.93 TWh/year of electricity and 15.84 PJ/year of heat; these amounts would ensure the energy independence of the Voivodeship. The Warmińsko-Mazurskie Voivodeship is an example of transformation towards “green” energy and shows that such a transformation is possible in Poland even in the short term. This would reduce air pollution and limit the import of energy resources, which gives grounds for optimism about implementing Poland’s energy transformation towards renewable energy (RE). Additionally, a SWOT analysis of each type of RE in the Warmińsko-Mazurskie Voivodeship is presented. The SWOT analysis identifies the strengths, weaknesses, opportunities and threats for RE in the Voivodeship and the whole country. Investor interest in RE in the Voivodeship is high and public support for new energy sources is usually strong, while the biggest barriers are high investment costs and complicated law in Poland.

    Challenges in the management of a patient with Cowden syndrome: case report and literature review

    We present a patient with the classical phenotype of a rare disorder, Cowden syndrome, and the challenges of its diagnosis and management. A breast surgeon has to be aware of this rare condition when treating a patient with breast manifestations of Cowden syndrome and has to refer the patient to a clinical geneticist for further evaluation. Sequencing of the PTEN gene showed the Asp24Gly mutation. According to the latest literature data, the lifetime risk of breast cancer for Cowden syndrome patients is 81%, and surgery is a justified option to reduce that risk. Bilateral risk-reducing mastectomy with immediate reconstruction was performed to eliminate further risk of breast cancer. Three years after the risk-reducing breast surgery, the patient is satisfied with the outcome. To the best of our knowledge, this is the first reported Cowden syndrome case with follow-up data after risk-reducing measures have been taken.